Competition of individual and institutional punishments in spatial public goods games

We have studied the evolution of strategies in spatial public goods games where both individual (peer) and
institutional (pool) punishments are present beside unconditional defector and cooperator strategies. The
evolution of strategy distribution is governed by imitation based on random sequential comparison of neighbors'
payoff for a fixed level of noise. Using numerical simulations we have evaluated the strategy frequencies and
phase diagrams when varying the synergy factor, punishment cost, and fine. Our attention is focused on two
extreme cases describing all the relevant behaviors in such a complex system. According to our numerical data
peer punishers prevail and control the system behavior in large segments of the parameter space while pool
punishers can only survive in the limit of weak peer punishment, where a rich variety of solutions is observed.
Paradoxically, the two types of punishment may extinguish each other's impact, resulting in the triumph of
defectors. The technical difficulties and suggested methods are briefly discussed.

PACS numbers: 89.65.-s, 89.75.Fb, 87.23.Kg



I. INTRODUCTION 

The emergence of cooperation among selfish individuals 
is an important and intensively studied puzzle inspired by 
systems of biology, sociology, or economics [1, 2]. One of 
the frequently used frameworks to capture the conflict of individual
and common interests is the so-called public goods
game (PGG) in which several players decide simultaneously 
whether they contribute or not to the common venture. The 
collected income is multiplied by a factor (representing the 
advantage of collective actions) and shared equally among 
all members of the group independently of their personal act. 
Accordingly, defectors, who decline to contribute but enjoy the
common benefit (due to the cooperators), collect higher individual
payoffs and are favored, leading to the "tragedy of the
commons" state [3].

In the last decade several mechanisms have already been 
identified that help resolve this dilemma by ensuring com- 
petitive payoff for altruistic (cooperative) players [4-19]. A 
plausible idea is to punish defectors by lowering their income 
which decreases their popularity [20-23]. Punishing cheaters,
however, can be carried out in two significantly different ways.

Firstly, players can retaliate individually by paying an extra
cost of punishment each time they face defectors. Naturally,
this so-called peer punisher strategy fares equally well
with pure cooperators in the absence of cheaters. The pure 
cooperators, however, who do not contribute to the sanctions 
but utilize the advantage of punishment, can be considered
as "second-order free-riders" [24]. As a consequence, the generally
less favored peer punisher strategy will gradually become extinct
and the original problem emerges again. Without introducing
further complexity this problem cannot be solved in a
well-mixed population. In a structured population, however, an
adequate solution may be achieved by utilizing spatial effects 
[2, 5]. Here the pure cooperators and peer punishers are able 
to separate from each other and fight independently against 
defectors. Since punishers do it more successfully, they even- 
tually displace the pure cooperators via an indirect territorial 
fight [25, 26]. 

The alternative way to impose sanctions is when players
invest a permanent cost into a punishment pool and punish
defectors "institutionally". In this case, if there is punishment
in the group, the fine imposed on defectors may not
necessarily depend on the actual number of punishers in the
group and the cost of punishers can also be independent of
the number of cheaters among group members. In this way
the cost is always charged independently of the necessity or 
efficiency of punishment. In a well-mixed population pool punishment
can only prevail if "second-order punishment" is allowed,
i.e., pure cooperators, who do not invest extra cost into
the punishment pool, are also fined [27, 28]. In the absence
of the latter possibility defectors will spread if the participation
in PGG is compulsory. In agreement with expectations,
spatial models offer another type of solution where
the pool punisher strategy can survive without assuming additional
punishment of pure cooperators. In the latter case a
self-organizing spatio-temporal pattern can be observed [29]. 
The emergence of spatial patterns, maintained by cyclic dom- 
inance among three strategies, is a general phenomenon and 
occurs for a wide variety of systems including PGG [4, 30, 31] 
and different variants of prisoner's dilemma game [32-36]. 

We note that many aspects of punishment were already in- 
vestigated in human experiments [21, 37-43], as well as by 
means of mathematical models with three [4, 44, 45], four 
[46, 47], and even more strategies [48, 49]. 

The seminal work of Sigmund et al. has revealed that pool 
punishers always lose and peer punishers prevail for well- 
mixed populations in the absence of second-order punishment 
[27, 28]. In the present paper we study the competition of punishing
strategies by assuming a structured population. It will be
demonstrated that the stable solution depends sensitively on
the relative cost that punishing strategies bear. Accordingly,
we have studied two extreme cases illustrating the possible relations
of pool and peer punisher players. It should be stressed
that in our model additional strategies, such as voluntary (optional)
participation in PGG or second-order punishment of
pure cooperators, are not allowed. Despite its simplicity the
spatial model exhibits complex behavior including different
space and time scales in connection with the emerging
solutions.

The remainder of this paper is organized as follows. In the
next section we describe the studied models and motivate the
two extreme cases, termed the "hard" and
"weak" peer punishment limits. In Sec. III we present the
solutions obtained by Monte Carlo (MC) simulations for an
expensive peer strategy. The results of the other extreme case
are presented in Sec. IV. Finally, we summarize our observations
and discuss their potential implications.



II. SPATIAL PUBLIC GOODS GAME WITH PUNISHING
STRATEGIES 

To preserve comparability with previous works [26, 29] the
public goods game is staged on a square lattice using periodic 
boundary conditions. We should emphasize, however, that the 
observed results are robust and are valid in a wider class of 
two-dimensional lattices. The players are arranged into overlapping
five-person ($G = 5$) groups in a way that each player
at site $x$ serves as a focal player in the group formed together
with his/her four nearest neighbors [50-54]. Consequently,
each individual belongs to $G = 5$ different groups and plays
five five-person games by following the same strategy in every
group he/she is affiliated with.

According to the four possible strategies, a player on site
$x$ is designated as a defector ($s_x = D$), a pure cooperator
($s_x = C$), a peer punisher ($s_x = E$), or a pool punisher ($s_x = O$). For
the last three strategies the player contributes a fixed amount
(equal to 1 without loss of generality) to the public goods
while defectors contribute nothing. The sum of all contributions
in each group is multiplied by the factor $r$ ($1 < r < G$),
reflecting the synergetic effects of cooperation, and the multiplied
investment is divided equally among the group members
irrespective of their strategies.

In addition to the basic game, defectors may be punished if
there are pool or peer punisher players in the group. Pool punishment
requires precursory allocation of resources, that is,
each punisher contributes a fixed amount $\gamma$ to the punishment
pool irrespective of the strategies in its neighborhood. Furthermore,
because of the institutional character of this sanction,
the resulting fine $\beta$ of defectors is independent of the
frequency of pool punishers: the only criterion is the presence
of at least one pool punisher in the group.

The characters of peer and pool punishment differ significantly.
Namely, the cost of peer punishment is charged only
if a peer punisher faces a defector, but this cost is multiplied
by the number of defectors ($N_D^g$) in the given group $g$
($g = 1, \ldots, G$). The latter fact reflects that a peer punisher
should penalize every defector individually. In addition, the
fine of a defector originating from peer punishment is accumulated
and is proportional to the number of peer punishers ($N_E^g$)
in the group. Denoting the number of cooperators and
pool punishers by $N_C^g$ and $N_O^g$ in the group, the payoffs for the
possible strategies can be given as

\begin{equation}
\begin{aligned}
P_C^g &= r (N_C^g + N_O^g + N_E^g)/G - 1, \\
P_O^g &= P_C^g - \gamma, \\
P_E^g &= P_C^g - \gamma m N_D^g, \\
P_D^g &= r (N_C^g + N_O^g + N_E^g)/G - \beta m N_E^g - \beta f(N_O^g),
\end{aligned}
\tag{1}
\end{equation}

where the step-like function $f(Z)$ is 1 if $Z > 0$ and 0 otherwise.
The total payoff of a player at site $x$ is accumulated
from five public goods games, consequently, $P_{s_x} = \sum_g P_{s_x}^g$
($g = 1, \ldots, G$).
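
The single-group payoffs of Eqs. (1) can be illustrated by a minimal sketch. The function name `group_payoffs` and its argument names are hypothetical and chosen only for this illustration; the parameters $r$, $\gamma$, $\beta$, $m$, and $G$ follow the definitions given above.

```python
def group_payoffs(n_c, n_o, n_e, n_d, r, gamma, beta, m, G=5):
    """Payoffs collected from one group g, following Eqs. (1).

    n_c, n_o, n_e, n_d are the numbers of C, O, E, and D players in the
    group (hypothetical argument names); they must add up to G.
    """
    assert n_c + n_o + n_e + n_d == G
    f = 1 if n_o > 0 else 0                # step-like function f(N_O)
    share = r * (n_c + n_o + n_e) / G      # equal share of the multiplied pot
    p_c = share - 1.0                      # every contributor pays 1
    p_o = p_c - gamma                      # lump cost of pool punishment
    p_e = p_c - gamma * m * n_d            # cost per punished defector
    p_d = share - beta * m * n_e - beta * f
    return {"C": p_c, "O": p_o, "E": p_e, "D": p_d}
```

The total payoff of a player would then be obtained by summing the entry of his/her own strategy over the $G$ groups he/she belongs to.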

The parameter $m$ in Eqs. (1) allows us to quantify two relevant
limits in the relation of pool and peer punisher strategies.
At the "hard" limit of peer punishment ($m = 1$) the pool punisher
pays a lump cost $\gamma$ while the peer punisher is charged
the same cost $\gamma$ for each action of punishment, that is, their
corresponding income is reduced by $\gamma N_D^g$. Notice that in spite of
their high cost the peer punishers may overcome pool punishers
in the absence of defectors. The latter constellations may
become relevant in spatial systems if defectors are rare.
On the other hand, hard peer punishment reduces
the income of defectors more efficiently if several neighbors
apply this strategy against a defector.

The "weak" limit of the peer punishment will be studied at
the parameter value $m = 1/(G - 1)$. In this case the cost of a
pool punisher always exceeds the cost of a peer punisher,
except for the case when every other group member around
the E player chooses defection. Now we consider only the case when
the efficiency (i.e., the corresponding fine) of peer punishment is
also reduced by the factor $m$. The above situations raise many
questions about the competition and coexistence of the
basically different types of punishment.
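
As a short check of the two limits, the cost and fine terms of Eqs. (1) can be compared directly for $G = 5$. The worked lines below use only the quantities defined above and assume a group that contains at least one defector.

```latex
\begin{align*}
\text{hard limit } (m = 1):\quad
  & \gamma m N_D^g = \gamma N_D^g \ge \gamma
    \quad \text{for } N_D^g \ge 1, \\
\text{weak limit } (m = 1/(G-1) = 1/4):\quad
  & \gamma m N_D^g = \gamma N_D^g / 4 \le \gamma,
    \quad \text{with equality only for } N_D^g = G - 1 = 4, \\
  & \beta m N_E^g = \beta N_E^g / 4 \le \beta,
    \quad \text{with equality only for } N_E^g = G - 1 = 4 .
\end{align*}
```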

Following the traditional concept of evolutionary game theory,
the population of the more successful individual strategies
expands at the expense of others having lower income
(fitness). For a networked population this strategy update
is usually performed via a stochastic imitation of the more
successful neighbors. Accordingly, during an elementary step
of Monte Carlo simulation a randomly selected player $x$ plays
public goods games with all of his/her co-players in $G$ groups and
collects the total payoff $P_{s_x}$ as described in Eqs. (1). Next, player
$x$ chooses one of its four nearest neighbors at random, and the
chosen co-player $y$ also acquires its payoff $P_{s_y}$ in the same
way. Finally, player $x$ imitates the strategy of player $y$ with a
probability $w(s_x \to s_y) = 1/\{1 + \exp[(P_{s_x} - P_{s_y})/K]\}$,
where $K$ quantifies the uncertainty in strategy adoptions
[2, 55]. Generally, the possibility of error in strategy update
prevents the system from being trapped in a frozen, metastable
state. For the sake of direct comparison with previous results
[26, 29] we set $K = 0.5$. It is emphasized that the solutions found
are robust and remain valid at other (low) values of the noise
parameter.
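
The elementary imitation step described above can be sketched as follows. This is only an illustration under stated assumptions: the names `adoption_probability` and `imitation_step`, as well as the callbacks `payoff_of` and `neighbors_of`, are hypothetical helpers that are not part of the original model description.

```python
import math
import random

def adoption_probability(p_x, p_y, K=0.5):
    """Probability w(s_x -> s_y) that x adopts the strategy of y."""
    return 1.0 / (1.0 + math.exp((p_x - p_y) / K))

def imitation_step(strategy, payoff_of, neighbors_of, K=0.5, rng=random):
    """One elementary MC step on `strategy`, a dict mapping site -> label.

    payoff_of(site, strategy) is assumed to return the total payoff of the
    site, and neighbors_of(site) the list of its four nearest neighbors
    (both are hypothetical callbacks supplied by the caller).
    """
    x = rng.choice(list(strategy))         # randomly selected player x
    y = rng.choice(neighbors_of(x))        # randomly chosen neighbor y
    if strategy[x] != strategy[y]:
        w = adoption_probability(payoff_of(x, strategy),
                                 payoff_of(y, strategy), K)
        if rng.random() < w:
            strategy[x] = strategy[y]      # x imitates y
    return strategy
```

For equal payoffs the adoption probability is 1/2, so the update reduces to an unbiased copying of a neighbor's strategy.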

The frequencies of pool and peer punishers ($\rho_O$ and $\rho_E$),
cooperators ($\rho_C$), and defectors ($\rho_D$) [satisfying the condition
$\rho_D + \rho_C + \rho_O + \rho_E = 1$] are determined by averaging over
a sampling time $t_s$ after a sufficiently long relaxation time $t_r$.
The time is measured in units of Monte Carlo steps (MCS),
each giving a chance once on average for the players to adopt one
of the neighboring strategies. Depending on the values of the
parameters $\gamma$, $\beta$, and $r$ the emerging spatial patterns exhibit
a large variety in the characteristic length and time scales. In
order to achieve an adequate accuracy (typically the line thickness)
we need to vary the linear system size from $L = 400$ to
7200 for sufficiently long sampling and relaxation times (in
some crucial cases $t_r = t_s >$ 10^ MCS). As we will describe
in detail in the subsequent sections, the usual choice of a random
distribution of strategies as an initial state was not always
appropriate to find the solution that is valid in the large system
size limit. At some parameter values even the largest attainable
system size ($L = 7200$) was not large enough to reach
the most stable solution from a random initial state. This problem
is related to the fact that the formation of some solutions is
characterized by different time scales and the fast relaxation
from a random state toward an intermediate (unstable) state
prevents the more complex solutions from emerging. In the latter
cases we had to use a prepared (artificial) initial state (e.g.,
a patchwork-like pattern) combining solutions of subsystems
where several strategies are missing.
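
A prepared initial state and the measurement of strategy frequencies can be sketched as below. The striped layout, the function names `patchwork_state` and `frequencies`, and the random filling of each stripe are assumptions made for illustration; in the simulations the patches combine already relaxed solutions of the subsystems, which this sketch does not reproduce.

```python
import random

def patchwork_state(L, patches, rng=random):
    """Return an L x L lattice (list of rows) made of vertical stripes.

    `patches` is a list of strategy tuples, e.g. [("D", "C", "O"),
    ("D", "O", "E")]; stripe k is filled at random with patches[k].
    L is assumed to be at least len(patches).
    """
    width = L // len(patches)
    lattice = []
    for i in range(L):
        row = []
        for j in range(L):
            k = min(j // width, len(patches) - 1)   # index of the stripe
            row.append(rng.choice(patches[k]))
        lattice.append(row)
    return lattice

def frequencies(lattice):
    """Fractions of D, C, O, and E sites; they add up to 1."""
    counts = {s: 0 for s in "DCOE"}
    for row in lattice:
        for s in row:
            counts[s] += 1
    n = sum(counts.values())
    return {s: c / n for s, c in counts.items()}
```

Averaging the output of `frequencies` over the sampling time after the relaxation time would give the stationary strategy frequencies.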



III. HARD PEER PUNISHMENT 

First we discuss the case of hard peer punishment because 
it yields simpler phase diagrams. In this case the cost of peer 
punishers exceeds the cost of pool punishers when several de- 
fectors are present in their neighborhood. At the same time 
the peer punishers can help each other if they form compact 
colonies in the spatial system and these collaborations mul- 
tiply the fine reducing the income of neighboring defectors. 
To reveal the possible stable solutions we have studied different
values of the synergy factor $r$ exhibiting significantly different
results in simpler models studied previously [26, 29]. The
applied values ($r = 3.8$, 3.5, and 2) represent three different
classes in the stationary behavior.

The highest synergy factor (r = 3.8) allows pure coopera- 
tors to survive even in the absence of punishment. At a slightly 
lower synergy value (r = 3.5) defectors would prevail with- 
out punishment, however, both types of punishment (as a pos- 
sible third strategy) can boost cooperation as it was already 
shown [26, 29]. In the case of the lowest synergy factor ($r = 2$),
the simpler three-strategy models predict significantly different
behaviors when applying only peer or pool punishment.
For low cost values cooperators were unable to survive for
the case of peer punishment in the presence of a weak noise
allowing additional rare creation of defectors. On the contrary,
for pool punishment, the D, C, and O strategies formed
a self-organizing spatial pattern maintained by cyclic dominance.
Now the numerical analysis is extended to higher values
of $\beta$ and $\gamma$. As a result, we have observed the coexistence
of D, C, and E strategies via a curious mechanism within a
region of parameters not investigated previously.

MC simulations were performed to determine the stationary
frequency of strategies when varying the value of fine $\beta$
for different values of cost $\gamma$ and $r$. The numerical data
indicated discontinuous (first-order) or continuous (second-order)
phase transition(s) between phases characterized by basically
different compositions and/or spatio-temporal structures as
illustrated in Fig. 1 for the lowest value of $r$ we first study.

If the system is started from a random initial state with four
strategies then the system evolves into the homogeneous (absorbing)
state D where only defectors remain alive ($\rho_D = 1$)
if the fine is smaller than a threshold value
$\beta_{th}(\gamma = 0.8, r = 2, K = 0.5) = 0.997(4)$ [indicated by an arrow in Fig. 1].
The simulations show clearly that defectors invade the territories
of peer punishers if $\beta < \beta_{th}$. For $\beta_{th} < \beta < \beta_{c1}$
the superiority of defectors ($\rho_D = 1$) is due to a mechanism
that can be understood by considering first the curious
coexistence of the D, C, and E strategies occurring for
$\beta_{c1} < \beta < \beta_{c2}$ [$\beta_{c1}(\gamma = 0.8, r = 2, K = 0.5) = 1.48(1)$
and $\beta_{c2}(\gamma = 0.8, r = 2, K = 0.5) = 2.10(1)$]. The
corresponding phase is denoted as (D+C+E). Within this phase
strategy E can invade the territories of Ds along the interfaces
separating them as illustrated in the snapshot in Fig. 2.
For sufficiently high values of $\beta$ and $\gamma$, however, the expensive
action of punishment reduces the income of both defectors and
peer punishers along the interface where players can increase
their payoff by choosing cooperation. As a result cooperators
can spread along these interfaces by forming a "monolayer".
At the same time the interfacial cooperators serve as a "cooperator
reservoir" from where cooperation can spread into the
phase E via the mechanism described by the voter model [56-58].
Rarely, the cooperators aggregate in the vicinity of the
interface and the given territory becomes unprotected against
the invasion of defectors. Consequently, the presence of cooperators
along the D-E interfaces reverses the direction of
invasion. In the snapshot of Fig. 2 one can observe both types
of invasions balanced in the (D+C+E) phase.

FIG. 1: (Color online) Average strategy frequencies vs. fine in the
final stationary state for the hard peer punishment limit ($m = 1$) at $\gamma = 0.8$
and $r = 2$. The corresponding phases are denoted at the top.
Lines are just to guide the eye. The arrow points to the value of fine
separating the phases D and D$_h$ where the average invasion velocity
between the domains of D and E strategies becomes zero.

The spreading of cooperators along the D-E interfaces is
influenced by the values of $\gamma$ and $\beta$ and it may become so efficient
that C monolayers are formed throughout these interfaces.
In that case the E domains are invaded by defectors with
the assistance of cooperators. Once the last peer punishers
are removed the defectors sweep out cooperators, too. This process
resembles a real-life situation referred to as "The Moor
has done his duty, the Moor may go". Such a scenario occurs
within the phase D$_h$ where the subscript $h$ refers to homoclinic
instability. The mentioned transient process to D is confirmed
by MC simulations for most of the runs in small systems (e.g.,
$L < 400$). The present system, however, can evolve into the
homogeneous state E ($\rho_E = 1$) with a probability increasing
with $L$. The phase E can conquer D (via a nucleation mechanism)
if a small colony of E players survives the extinction of
cooperators and the colony size exceeds a critical value during
the stochastic evolutionary steps. It is emphasized that the E
invasion can be reversed by the offspring of a single cooperator
substituted for one of the players along the D-E interface
and finally the system evolves into a state dominated by defectors.
Notice that pool punishers die out for all the cases plotted
in Fig. 1. Furthermore, the (D+C+E) phase transforms into E
with a continuous extinction of both the D and C strategies
when approaching $\beta_{c2}$. Similar numerical investigations were
made for many other values of cost $\gamma$ and the results are
summarized in a phase diagram plotted in Fig. 3.

FIG. 2: (Color online) Typical arrangement of cooperators (white),
defectors (black) and peer punishers (orange - light gray) for the
(D+C+E) phase within a $200 \times 200$ part of a larger system at $r = 2$,
$\gamma = 1$, and $\beta = 2.5$ in the hard peer punishment limit.

The simulations indicate that both defectors and pool punishers
die out within a transient time for sufficiently high values
of $\beta$ if $\gamma < \gamma_c(r = 2) = 0.59(1)$. Since the surviving
cooperators and peer punishers receive the same payoff,
the resulting two-strategy evolutionary process becomes
equivalent to that described by the voter model. The
two-dimensional voter model exhibits an extremely (logarithmically)
slow evolution toward one of the (homogeneous) absorbing
states [59]. The coexistence of C and E strategies,
however, can be destroyed by introducing defectors (as mutants
even at arbitrarily small rates), favoring and accelerating
the fixation in the homogeneous state of the E strategy [60].
This is the reason why the final stationary state is denoted by
E in the phase diagrams throughout the whole paper (see, e.g.,
Figs. 1 and 3). Finally we mention that the dotted line in Fig. 3
is the analytical continuation of the dashed (red) one separating
the phases D and E. Along these lines the average velocity
of invasion between the phases E and D becomes zero.

FIG. 3: (Color online) Cost-fine phase diagram in the hard peer punishment
limit ($m = 1$) for a low synergy factor ($r = 2$). The dashed
(red) and solid (blue) lines represent first- and second-order phase
transitions, the dotted (black) line separates the homogeneous phases of
defectors (D) with different stabilities.



As expected, the increase of $r$ supports the maintenance of
cooperation. Consequently, a smaller fine is capable of suppressing
defection. As for $r = 2$, pool punishers die out
quickly if $r = 3.5$. For high values of $\beta$ and $\gamma$ the cooperators
prefer staying along the interfaces separating domains of D
and E phases (as described above) and yield a slower tendency
toward the final stationary state. This undesired technical difficulty
is reduced significantly for lower values of cost and
fine where Fig. 4 illustrates a discontinuous (first-order) phase
transition between the phases D and E at a threshold value of
fine increasing with the cost of peer punishment if these quantities
exceed the suitable critical values [$\beta_c = 0.13(1)$ and
$\gamma_c = 0.19(1)$ for $r = 3.5$ and $K = 0.5$]. When increasing $\beta$
for $\gamma < \gamma_c$ the first-order phase transition from the homogeneous
D state to E is replaced by a coexistence region of D
and E strategies. Within this phase the frequency of peer punishers
varies continuously from 0 to 1 and both transitions exhibit
the general features of the directed percolation universality
class in agreement with previous results obtained for imitation
dynamics [32, 55, 61].

FIG. 4: (Color online) Cost-fine phase diagram at $m = 1$ for $r = 3.5$.
Solid (blue) and dashed (red) lines represent second-order and
first-order phase transitions, respectively. D+E denotes a phase with
coexisting D and E strategies.

For higher synergy factors (e.g., $r = 3.8$) the cooperators
survive in the absence of punishment ($\gamma = 0$). Consequently,
the homogeneous D phase is missing in the phase diagram. In
Figure 5 the phase D+C refers to the coexistence of cooperators
and defectors in the final stationary states. The increase
of fine yields a discontinuous transition from D+C to D+E if
$\gamma < 0.21(1)$ for the given parameters, otherwise one can observe
a first-order transition from D+C to E (within the region
of $\gamma$ and $\beta$ plotted in Fig. 5). Within the D+E phase the density
of defectors vanishes continuously when approaching the
phase boundary separating the phases D+E and E.

FIG. 5: (Color online) Cost-fine phase diagram at $m = 1$ for $r = 3.8$.
Phases and phase boundaries are denoted as in Fig. 4.



IV. WEAK PEER PUNISHMENT

In this section we focus on the opposite limit where
peer punishers bear a lower cost of punishment and
enforce a lower fine than pool punishers.
Using the parameter value $m = 1/(G - 1)$, their costs are
equal only if a peer punisher is surrounded only by defectors
($N_D^g = G - 1$). According to a naive argument, the peer punishers
might benefit from the powerful fine of pool punishers,
which would strengthen their position further compared to the latter
strategy. This is expected especially given what
we observed in the previous section, where peer punisher players
dominate the system despite their large extra cost. Following
the established protocol, we explore the possible solutions
at three representative synergy factors.

At the high ($r = 3.8$) synergy factor the phase diagram, plotted
in Fig. 6, partly supports our expectation. Namely, at high
cost ($\gamma > 0.0253$) the solutions become identical to those obtained
in the absence of O strategies. At low values of cost,
however, the above mentioned belief is broken because pool
punishers can survive although they are charged a larger
permanent cost of punishment. Notice furthermore that they
can fully displace not only pure cooperators but also peer punishers,
both of whom can be considered as second-order free-riders.

FIG. 6: (Color online) Phase diagram for the weak peer punishment
limit ($m = 1/(G - 1)$) at $r = 3.8$. Solid (blue) and dashed (red)
lines represent second- and first-order phase transitions.

Figure 7 shows the variation of strategy frequencies and
illustrates five consecutive phase transitions (at $\beta_{c1}$, $\beta_{c2}$, $\ldots$,
$\beta_{c5}$) when the fine is increased at $\gamma = 0.005$.

FIG. 7: (Color online) Strategy frequencies as a function of fine in the
weak peer punishment limit ($m = 1/(G - 1)$) for a low punishment
cost ($\gamma = 0.005$) and $r = 3.8$. Inset features the enlargement of the
small-fine area.

The visualization of the time-dependence of the spatial strategy
distribution has helped us understand what happens and
the characteristic mechanisms can be summarized as follows.
If the system is started from a random initial state then after a
short relaxation process we can observe a sea of defectors with
homogeneous islands of cooperative strategies (C, O, and E)
for low values of $\beta$. Due to the stochastic dynamics the islands
grow and shrink at random and sometimes they can disappear,
unite, or split into two. In the late stage of the evolutionary
process the pattern formation can be considered as a competition
among three two-strategy associations (denoted as D+C,
D+O, and D+E) representing the corresponding stationary solutions
of subsystems where only two strategies are present [2].
Evidently, the D+C solution can invade the other two associations
for infinitesimally small values of fine, because Cs
are not charged the cost of punishment. The increase of
fine, however, favors the survival of the O and E strategies.
As a result, the average frequency of the punishing strategies
($\rho_O$ and $\rho_E$) increases with the fine in the corresponding
two-strategy phases (D+O and D+E) while $\rho_C$ remains constant in
the phase D+C. The mentioned variations modify the relationship
among the three two-strategy solutions. The MC simulations
indicate that the D+E phase conquers the whole system
if $\beta_{c1} < \beta < \beta_{c2}$ and the D+O phase can be observed in the
final state if $\beta_{c2} < \beta < \beta_{c3}$. In the latter two phases the
frequency of defectors decreases monotonically with the fine and
the punisher islands are simultaneously separated by channels
becoming narrower. In parallel with this process the punishing
islands unite more frequently, enhancing the relevance of direct
competition between E and O that boosts the spreading of E.
The latter effect helps peer punishers to survive in the
three-strategy phase D+O+E within the region $\beta_{c3} < \beta < \beta_{c4}$.
In the following region of fine ($\beta_{c4} < \beta < \beta_{c5}$) the direct
E invasion sweeps out all the pool punishers and the system
develops into the phase D+E where the defector frequency
approaches zero at $\beta_{c5}$. If $\beta > \beta_{c5}$ then the system evolves into the
phase E as detailed above.

The general behavior of the four-strategy system at $r = 3.5$
is similar to that described above except for the missing D+C
phase in the low-fine limit. Figure 8 shows that pool punishers
can survive with defectors both in the absence and in the presence of
peer punishers at a sufficiently low cost. Otherwise the phase
diagram is identical to the result achieved in the absence of
pool punishers.

FIG. 8: (Color online) Phase diagram for the weak peer punishment
limit ($m = 1/(G - 1)$) at $r = 3.5$.

Significantly different and more complex solutions are
found at the low synergy factor $r$ offering a modest efficiency of
investment paid into the common pool. The phase diagram
for $r = 2$ is plotted in Fig. 9. In agreement with the previous
results some parts of the corresponding phase diagram are identical
to those one can obtain if only one type of punishment
is allowed. For example, at high fine values, the E strategy
conquers not only D but O strategies as well, and the solution
reproduces the cases when the player can choose only the D,
C, or E strategy. This feature is related to an earlier observation
indicating that the increase of fine would not necessarily
help the invasion of the O strategy, whereas peer punishers are
unequivocally supported and conquer the system if $\beta$ is
enhanced.

On the other hand, one can observe a striking similarity with
the previous results of a simpler model [29] obtained in the
absence of peer punishers. This happens in the low-fine region
where Es cannot fight efficiently against D and die out
within a transient period. Accordingly, in this region of the
$\beta$-$\gamma$ plane the D+O, (D+C+DO)$_c$, and (D+C+O)$_c$ phases are
identified (as detailed in [29]) where the subscript "c" refers
to a self-organizing spatial strategy distribution maintained by
cyclic dominance by analogy with evolutionary rock-paper-scissors
games.



FIG. 9: (Color online) Cost-fine phase diagram for the weak peer
punishment limit ($m = 1/(G - 1)$) at $r = 2.0$.

The coexistence of both types of punishment occurs in the
phases (D+O+E)$_c$ and D+C+O+E indicated in the $\beta$-$\gamma$ phase
diagram (see Fig. 9). Within the phase (D+O+E)$_c$ three strategies
dominate each other cyclically (namely, D beats E beats
O beats D) and form a self-organizing spatial pattern. At these
parameter values the (D+C+O)$_c$ phase is also a possible solution.



A. Stability analyses 

In the present four-strategy model, however, the (D+O+E)$_c$
coalition (with proper spatio-temporal pattern) is more stable
and capable of invading the territory of other solutions as
demonstrated by consecutive snapshots in Fig. 10. For this
purpose the whole system is divided into large rectangular regions
with proper periodic boundary conditions (PBC) for each box
during a relaxation time. Within each box only three strategies
[D+C+O or D+O+E] are placed randomly in the initial state.
After a suitable relaxation time the proper PBCs are removed
and simultaneously the usual PBC is switched on. This trick
has allowed us to visualize the spatial competition between
the solutions (D+C+O)$_c$ (left) and (D+O+E)$_c$ (right).
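
The switch between box-wise and global periodic boundary conditions can be sketched with a neighbor function. This is a minimal illustration under stated assumptions: the function name `neighbors` and the `box` parameter are hypothetical, and the box sizes are assumed to divide the linear size $L$.

```python
def neighbors(i, j, L, box=None):
    """Four nearest neighbors of site (i, j) on an L x L square lattice.

    With box=None the usual periodic boundary conditions of the whole
    system apply. With box=(bi, bj), the height and width of a box
    (hypothetical parameter, assumed to divide L), periodicity is applied
    inside the box containing (i, j), so that boxes do not interact.
    """
    bi, bj = (L, L) if box is None else box
    i0, j0 = (i // bi) * bi, (j // bj) * bj   # upper-left corner of the box
    up = (i0 + (i - i0 - 1) % bi, j)
    down = (i0 + (i - i0 + 1) % bi, j)
    left = (i, j0 + (j - j0 - 1) % bj)
    right = (i, j0 + (j - j0 + 1) % bj)
    return [up, down, left, right]
```

During the relaxation one would call the function with the box argument, and afterwards with `box=None`, which opens the interfaces between the separately relaxed regions.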




FIG. 10: (Color online) Spatial competition between two solutions of
three-strategy subsystems in the weak peer punishment limit
($m = 1/(G - 1)$) for $r = 2.0$, $\beta = 0.7$, and $\gamma = 0.025$. Before invasions
are allowed at $t = 0$ MCS along the vertical interfaces, both
stationary solutions [(D+C+O)$_c$ (left) and (D+O+E)$_c$ (right)] have
been developed without disturbing each other in the corresponding
regions. Snapshots of a $400 \times 400$ part of an $800 \times 800$
system are taken at $t = 0$ MCS (a), 200 MCS (b), 1000 MCS (c),
and from the stationary state (d). Lower panel shows the colors of
strategies and their relations at these parameters (indicated by an arrow
pointing toward the strategy that is invaded by the other). These are black
for D, white for C, blue (dark gray) for O, and orange (light gray)
for E.

If the system size is large enough then there is always a 
chance that all the possible solutions emerge locally somewhere 
in the system, and the most stable solution can finally 
prevail through an invasion process in the whole system. 
The latter expectation is not necessarily satisfied, particularly 
if the system size is small (such as L < 2000 for c ≈ 0.02, 
r = 2). Besides, the "small" size of the system also limits 
the characteristic size of patterns and prevents the formation 
of phases with significantly larger correlation lengths. 
These are the reasons why one cannot achieve reliable MC 
results on small systems when analyzing the system behavior 
in the vicinity of a critical point where the correlation length 
diverges [62]. Further difficulties arise from the fact that the 
present spatio-temporal patterns can be characterized by two 
or more length scales, preventing the straightforward application 
of methods (e.g., finite-size scaling) developed in statistical 
physics for the investigation of simpler systems [63]. 

Besides, the small size decreases the probability of the 
emergence of phases requiring longer relaxation through 
a complex evolutionary process. Figure 11 demonstrates the 
related difficulties of numerical simulations we faced when 
studying this system for sizes as large as L = 5000. Despite 
the large system size the final state is still ambiguous if the 
system is started from a random initial state. In most cases the 
system evolves to either the D or the O state, as demonstrated by the 
upper two plots of Fig. 11. Only very few runs result in a 
third type, the (D+C+O)c phase. In order to justify the stability of 
the (D+C+O)c phase we have performed further stability analyses. 
Namely, starting from a three-strategy initial state, the 
stochastic evolution of the (D+C+O)c phase is interrupted at a 
time (indicated by an arrow in the bottom plot of Fig. 11), 
half of the system is replaced by a large domain of the E phase, and 
afterwards the simulation is continued. The time dependence 
of the strategy frequencies quantifies how the original solution 
is restored. A similar analysis can be done to justify the superiority 
of the (D+C+O)c phase over the O phase. It is worth mentioning 
that this unavoidable analysis is not time-consuming due 
to the smaller system size used in the simulations. Furthermore, such 
a conclusive test cannot be avoided when the model contains 
more than three competing strategies. 
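
The intervention used in this stability test can be sketched in a few lines. This is a minimal illustration, not the simulation code itself: the function names are hypothetical, the lattice is assumed to be stored as a list of rows of strategy labels, and replacing the right half (rather than any other half) is an arbitrary choice made here.

```python
from collections import Counter


def insert_domain(lattice, strategy="E"):
    """Return a copy of the lattice whose right half is replaced by a
    homogeneous domain of `strategy`; the input lattice is left unchanged."""
    size = len(lattice[0])
    half = size // 2
    return [row[:half] + [strategy] * (size - half) for row in lattice]


def strategy_frequencies(lattice, labels=("D", "C", "O", "E")):
    """Fraction of lattice sites occupied by each strategy label."""
    counts = Counter(s for row in lattice for s in row)
    total = sum(counts.values())
    return {label: counts[label] / total for label in labels}
```

Recording `strategy_frequencies` after every Monte Carlo step following the insertion would give the kind of time series that shows whether the frequency of E decays and the original three-strategy solution is restored.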

Now we discuss two (perpendicular) cross-sections of the 
cost-fine phase diagram at r = 2 (Fig. 9) where the competition 
between the two punishing strategies plays a relevant 
role. The upper plot of Fig. 12 shows the variation of strategy 
frequencies in the stationary state when the fine is varied 
from β = 0.8 to β = 1.0 at a fixed cost. The reader 
can observe that the four-strategy D+C+O+E phase occurs via 
a continuous transition from the phase (D+C+O)c when 
increasing the fine, and subsequently it transforms abruptly into 
the phase Dh(O) where only defectors are present in the final 
stationary state. As previously, the subscript of 
the notation Dh(O) refers to a homoclinic instability, which is 
different from those discussed in the previous section. In the 
present case the homogeneous D phase can be invaded by the 
offspring of pool punishers if they help each other by forming 
a sufficiently large domain. At the same time the growing 
domain of pool punishers can be eliminated by the offspring 
of either a single cooperator or a single peer punisher who is inserted 
into the territory of pool punishers as a mutant created with 
an arbitrarily small rate. In both cases defectors play the role 
of tertius gaudens and prevail in the whole population. For the 
given cost the peer punishers can beat defectors (with or without 
the presence of others) if β > 0.978(2), when the system 
evolves into the phase E. 

The lower plot of Fig. 12 illustrates three consecutive phase 
transitions when increasing γ from 0 to 0.1 for a fixed value of 
fine (β = 0.8). Notice that within the four-strategy phase the 
frequency of cooperators is low (ρC < 0.1). Despite the 



[Figure 11: three plots; horizontal axes: time [MCS].]



FIG. 11: (Color online) The upper two plots show evolutionary 
processes within the region of the (D+C+O)c phase when the system 
is started from a random initial state for L = 5000, using identical 
r = 2.0, β = 0.78, γ = 0.1, and m = 1/(G - 1) parameter values. 
The bottom plot demonstrates the stability of the (D+C+O)c phase 
if we insert a large E domain into the given state at t = 0 MCS (here 
L = 1200). 



low values of ρC the presence of cooperators influences the 
efficiency of punishing strategies in a complex way, as indicated 
by Fig. 12. 

FIG. 12: (Color online) Strategy frequencies vs. fine if γ = 0.1 
(upper plot) and vs. cost if β = 0.8 (bottom plot) for m = 1/(G - 1) 
and r = 2.0. Notation of phases is indicated at the top. 



V. CONCLUSIONS 

In this work we have compared the efficiency of pool (institutional) 
and peer (individual) punishments within the framework 
of the spatial public goods game when the strategy evolution 
is controlled by stochastic imitation (resembling Darwinian 
selection). This study is considered an initial effort to 
understand why some societies rely mainly on peer punishment 
and others prefer pool punishment. As a general conclusion, 
the outcome in a structured population may depend sensitively on 
the parameter values that characterize the relation of the punishment 
strategies. 

Both types of punishment are applied by cooperative players 
in different ways. The present four-strategy model exhibits 
a wide variety of final stationary behaviors in the limit of 
infinitely large system size when tuning the model parameters 
(synergy factor, cost and fine of punishment) for a fixed 
level of noise. In many cases the peer punisher strategy seems 
to be more efficient in eliminating the "tragedy of the 
commons," in which all players choose defection. The numerical 
analysis allowed us to identify phases where both types of 
punishment coexist, sometimes together with the (pure) 
cooperators weakening the efficiency of punishment. We have 
found regions in the plane of parameters where the competition 
between the different punishments helped defectors to 
prevail in the whole system. 

Finally we emphasize some additional and general conclusions 
extracted during the numerical analysis of the present 
four-strategy evolutionary game on a square lattice. Namely, 
we have observed an interesting phase where the spreading of 
one of the strategies (here cooperation) is favored along an 
interface and the resultant monolayer can reverse the direction 
of invasion between the homogeneous domains it separates. We 
think that the structure of the present interactions (namely, the 
players' incomes are accumulated from five five-person games) 
provides convenient conditions for studying these types of 
self-organizing patterns. Furthermore, we should stress the 
technical difficulties in the evaluation of phase diagrams 
describing the boundary between distinguishable stationary 
behaviors in the limit L → ∞. It turned out that, using the 
concepts of competing associations [2], we should check the direction 
of invasions between most pairs of solutions characterizing 
the spatio-temporal patterns for all possible subsystems 
if we wish to avoid artifacts related to the complex 
finite-size effects. At the same time the application of this 
approach may enhance the accuracy and efficiency of the numerical 
investigations when quantifying the phase boundaries 
in the large-size limit. Evidently, the systematic investigation 
of the finite-size effects and also of the expansion of a solution into 
another subsystem solution is unavoidable in similar complex 
systems. 

We thank Karl Sigmund for initiating the present investiga- 
tions and stimulating discussions. This work was supported 
by the Hungarian National Research Fund (grant K-73449), 
the Bolyai Research Grant, and the COST Action MP0801 
(Physics of Competition and Conflicts). 